METHOD FOR IDENTIFYING STRUCTURALLY ACTIVE COMPOUNDS
USING CONFORMATIONAL MEMORIES

Introduction 

The present invention relates to a method for predicting the
conformation and functionality of a molecule, comprising the steps
of, first, performing multiple simulated annealing runs in order to
reveal populated and unpopulated regions of multidimensional
conformation space, and, second, performing a simulation at a fixed
temperature, with sampling only from populated regions found in the
first step.

Background of the Invention

The insights gained from simulations, and the growing prevalence of
relatively inexpensive computer power, have led to the widespread use
of many computational techniques. The successes of these methods have
prompted the continuing development of new methods to study more and
more complex systems. Early chemical simulations, for example, were
used to estimate equilibrium statistical mechanical quantities and
transport properties on collections of point particles whose
interactions were governed by simple potentials (Alder and
Wainwright, Phys. Rev. Lett., 18:988 (1967); Hoover and Ree, J. Chem.
Phys. 49:3609 (1968); Rahman, Phys. Rev. 136:A405 (1964); Verlet,
Phys. Rev. (1967)). Today, simulations are used to compute free
energies (Jorgensen, Acc. Chem. Res. 22:184 (1989); Kollman, Chem.
Rev. 93:2395 (1993)) and to study complex systems like protein
structure (McCammon and Harvey, in Dynamics of Proteins and Nucleic
Acids, Cambridge University Press, New York, N.Y. (1987)). Increasing
demands naturally lead to increasing difficulties. One of the great
difficulties in computational chemistry is the simulation of flexible
organic and biological molecules. These systems are problematic
because they belong to the general class of problems known as
multiple time scale problems (Brackbill and Cohen, Multiple Time
Scales, Academic Press, Orlando, Florida (1985)). Flexible organic
molecules belong to this class because bond length and bond angle
motion occurs on a femtosecond to picosecond time scale, while
torsional motion may occur on a nanosecond time scale or longer.
Since true convergence of statistical mechanical properties requires
multiple interconversions between all torsional states, obtaining
stable averages may require simulations on the order of tens or even
hundreds of nanoseconds.

Typically, chemical systems are simulated using Monte Carlo ("MC";
Metropolis et al., J. Chem. Phys. 21:1087 (1953)) or molecular
dynamics ("MD"; Allen and Tildesley, in Computer Simulation of
Liquids, Clarendon, Oxford (1987)). The recognition that the study of
complex systems may overtax these standard methods has led to much
work on the development of more powerful techniques. Several
different variations and extensions of these two basic procedures
have been tried in an attempt to find more efficient methods. These
new algorithms generally fall into two broad classes: MD methods with
the addition of some random character (Andersen, J. Chem. Phys.
72:2384 (1980); van Gunsteren and Berendsen, Mol. Sim., 1:173 (1988))
and MC methods which utilize some partial deterministic character to
generate better trial moves (Brass et al., Biopol., 33:1307 (1993);
Heermann et al., Comp. Phys. Comm., 60:311 (1990); Cao and Berne, J.
Chem. Phys., 92:1980 (1990); Rao and Berne, J. Chem. Phys., 71:129
(1979); Rossky et al., J. Chem. Phys., 69:4628 (1978)). The most
recent algorithm, the mixed MC-SD algorithm, is a pure hybrid that
uses MD and MC methods equally (Burger et al., J. Amer. Chem. Soc.,
116:3593).

In the MC method, increases in efficiency may be obtained by
generating nonuniform random trial moves. If this is done over the
same surface being studied, it is known as biased sampling. If the
searching is carried out over a potential surface that is different
from the surface being studied, it is characterized as importance
sampling. Since importance sampling is done over a different surface
than the one actually being studied, it is necessary to appropriately
weight the statistical mechanical averages (Kalos and Whitlock, Monte
Carlo Methods vol 1, Wiley, New York, N.Y. 1986). In biased sampling,
since the searching is being done over the same potential surface as
the actual surface under study, no weighting of the averages is
necessary. In biased sampling, MC trial moves are based on some a
priori knowledge of the space to be sampled. Regions that are more
"important" are sampled with greater frequency in the full
expectation that the speed of convergence will be enhanced. In
practice, a simulation employing biased sampling is done in two
steps: some initial procedure is employed to reveal the important
gross features, then an extensive search utilizing this information
is performed. Obviously, this procedure will be valid and useful only
if it is faster than a standard sampling procedure, and if it
introduces no spurious artifacts.
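
The weighting that importance sampling requires can be illustrated
with a minimal sketch. It assumes that configurations were drawn with
Boltzmann probability on a sampled surface while averages are wanted
on the studied surface; the function name, the argument names and the
single inverse temperature beta are illustrative assumptions and do
not come from the cited references. When the two surfaces coincide,
every weight equals one and the plain average is recovered.

```python
import math


def reweighted_average(values, u_studied, u_sampled, beta):
    """Estimate an average on the studied surface from samples drawn on
    a different (sampled) surface.

    Hypothetical helper: sample i carries the weight
    exp(-beta * (u_studied[i] - u_sampled[i])).
    """
    weights = [math.exp(-beta * (us - ub))
               for us, ub in zip(u_studied, u_sampled)]
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, values)) / total


# Identical surfaces: all weights are 1, so this is the ordinary mean.
print(reweighted_average([1.0, 2.0, 3.0], [0.5] * 3, [0.5] * 3, beta=1.0))
```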

SUMMARY OF THE INVENTION

The present invention relates to a method for identifying
structurally active molecules comprising the steps of, first,
performing multiple simulated annealing runs in order to reveal
populated and unpopulated regions of multidimensional conformation
space, and, second, performing a simulation at a fixed temperature,
with sampling only from populated regions found in the first step. It
is based, at least in part, on the discovery that the method of the
invention could be used to sort a large family of analogs of
gonadotrophin releasing hormone ("GnRH analogs") into groups having
low or high affinity for the GnRH receptor. The method of the present
invention offers the advantage that, since the simulated annealing
runs quickly reveal unpopulated regions of the conformation space,
the volume of conformation space that needs to be sampled in the
second phase of the algorithm is reduced by many orders of magnitude.
Additionally, since no energy minimization is used, these populations
represent a canonical ensemble which may be used to estimate
conformational free energies.

Description of the Figures

Figure 1. Structure of LTB4.

Figure 2. A Flex-Map or "Conformational Memory" of dihedral 1 from
LTB4.

Figure 3. Graphical representation of the mapping of random numbers
onto the dihedral distribution. Random numbers between 0 and 1
determine dihedral values (using the ---- line). For example, 0.6
maps to +65°.

Figure 4. Plot of the average LTB4 conformer energy vs temperature
for ten normal SA runs and ten Smart-SA runs. Very rapid energy
lowering is possible using Smart-SA, although the ultimate energy is
similar. Each 10 run set required ~16 hrs of CPU time on a Vax 8600
and represents 100,000 conformers each.

Figure 5. Molecular structure of the gonadotropin-releasing hormone
(GnRH). The 35 rotatable torsional angles are indicated by arrows.

Figure 6. Conformational Memories of selected dihedral angles in the
gonadotropin-releasing hormone (see Fig. 5 for identity of the
angles). a. Dihedral angle 4; b. Dihedral angle 6; c. Dihedral angle
7; d. Dihedral angle 35.

Figure 7. Conformational Memory difference maps of dihedral angle 20
in GnRH (see Fig. 5). The difference maps were created by subtracting
the Conformational Memories from A. 25 and 10 runs, B. 50 and 25
runs, C. 75 and 50 runs, D. 100 and 75 runs. Note that in almost all
regions differences are <1%. Conformational Memory Difference Maps of
the other dihedrals are very similar.

Figure 8. A sequence of 310K temperature slices from the
Conformational Memory of bond 17, calculated with a. 25 runs; b. 50
runs; c. 75 runs; d. 100 runs. Note the symmetrically equivalent
population distributions centered about -90 and 90.

Figure 9. The choice of dihedral angle values in biased sampling of
the populated region of the Conformational Memories. The illustration
is for dihedral angle 19. Panel (a) shows a histogram representation
of the probability distribution for the dihedral angle; panel (b)
shows the cumulative probability distribution for dihedral angle 19.
Since the random number generator is a cumulative probability
distribution, biased sampling is done from the histogram in part (b).
If the random number 0.2 is generated, which corresponds to the
second block of the histogram in part (b), the new trial dihedral
will be chosen from the interval -170 to -160 degrees with the actual
value obtained from a linear interpolation within this interval. If
the random number 0.4 is generated, which corresponds to the 28th
block of the histogram in part (b), the new trial dihedral will be
chosen from the interval 90 to 100 degrees. Note that the region
between -60 and 60, which has no population in part (a), is
automatically skipped when sampling from part (b).



WO 96/34347 






PCT/US96/06U0 



Figure 10. Backbone trace of a representative of the five
conformational families of GnRH obtained from Conformational
Memories. Structures with a beta-type turn have a 70% population.
Structures with a straight backbone have approximately 5% population.

Figure 11. Superimposition of 70 structures that make up the major
conformational family of GnRH obtained from Conformational Memories.
While there is a large amount of fluctuation in the backbone, and an
even greater amount of fluctuation in the side chains (especially
Arg8), there is a clear beta-type turn from residues 5-8 in this
family.

Figure 12. Backbone trace of a representative of the two
conformational families of Lys8-GnRH obtained from Conformational
Memories. The structure with the beta-type turn comes from a family
with an approximately 3% population. The structure with the straight
backbone comes from a family with an approximately 70% population.

Figure 13. Superimposition of 70 structures that make up the major
conformational family of Lys8-GnRH obtained from Conformational
Memories. While there is a large amount of fluctuation in the
backbone, and an even greater amount of fluctuation in the side
chains (especially Lys8), the backbone is clearly extended.

Figure 14. Superimposition of a high affinity GnRH cyclic analog
(Structure I) and a representative of the major GnRH conformational
family (Structure II). Eleven backbone atoms from residues 5-8 were
used for the superimposition.

Detailed Description of the Invention

For purposes of clarity of description, and not by way of limitation,
the present invention is described by way of two examples. First, the
method of the invention is applied to determining the structure of
leukotrienes. Second, the method of the invention is used to identify
GnRH analogs which have a high affinity of binding to the GnRH
receptor.

Example: Determining the Structure of Leukotrienes
Leukotrienes, for example, are an important class of natural
antiinflammatory agents (Samuelsson et al., Prostaglandins, 17:785
(1979); "The Leukotrienes: Their Biological Significance," P. J.
Piper, Ed., Raven Press, N.Y., (1986)). Understanding the bioactive
conformations of a key member of this class such as LTB4 (Figure 1)
involves conformational analysis of its fourteen flexible dihedrals.

This is, however, an extremely difficult problem. For a description
of "impossible" computational problems see: M. Garey, Computers and
Intractability (W. H. Freeman and Co., New York, N.Y. 1979). Even
considering only a three state model around each bond (anti and +/-
gauche) there are 3^14 (4,782,969) possible conformations
(Kirkpatrick et al., Science, 220:671 (1983); Simulated Annealing and
Optimization, M.W. Johnson, Ed., American Sciences Press, Syracuse,
N.Y. (1988)). A recent case study on the conformational analysis of
cycloheptadecane (Saunders et al., J. Amer. Chem. Soc. 112:1419
(1990)), which the authors state is equivalent to a 12-dimensional
nonsymmetric problem, nicely illustrates the difficulties of
searching a multidimensional conformational space.
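
The growth of a discrete-state search space with the number of
flexible dihedrals is easy to check numerically. The sketch below is
illustrative only; the function name and its default of three states
per dihedral (anti and +/- gauche) are assumptions made for this
example.

```python
def three_state_conformations(n_dihedrals, states_per_dihedral=3):
    """Number of grid conformations when every flexible dihedral is
    restricted to a fixed number of states."""
    return states_per_dihedral ** n_dihedrals


# Each additional dihedral multiplies the count by three.
for n in (12, 13, 14):
    print(n, three_state_conformations(n))
```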

Smart-Simulated Annealing

To address this problem we have developed a conformational analysis
technique which combines simulated annealing (SA) (Kirkpatrick et
al., Science, 220:671 (1983); Simulated Annealing and Optimization,
M.W. Johnson, Ed., American Sciences Press, Syracuse, N.Y. (1988);
Wilson et al., Tetrahedron Letters, 4343 (1988); Wilson et al.,
Proceedings of the Seventh Workshop on Vitamin D, Walter de Gruyter
Co. (1988); Wilson and Cui, Biopolymers, 29:225 (1990); Wilson et
al., J. Comp. Chem., 12, 3, 342 (1991)) and biased sampling (Kalos
and Whitlock, Monte Carlo Methods vol 1, Wiley, New York, N.Y. 1986)
into a type of learning algorithm (Caudill, AI Expert, 12/89, 4/90,
6/90; Judd, in Neural Network Design and the Complexity of Learning,
MIT Press, Cambridge, MA (1990); Statistical Mechanics of Neural
Networks, Luis Garrido, Ed., Springer-Verlag, New York, N.Y. (1990);
Lacey, Ed., Neural Networks, Tetrahedron Computer Methodology, 1990,
3). The method is a 2-stage process made up of a learning phase and
an implementation phase. The learning phase starts by randomly
sampling the dihedral space of all flexible bonds using the simulated
annealing algorithm (Kirkpatrick et al., Science, 220:671 (1983);
Simulated Annealing and Optimization, M.W. Johnson, Ed., American
Sciences Press, Syracuse, N.Y. (1988)). The entire 360° continuous
dihedral space of all flexible torsional angles is sampled in
accordance with the fundamental hypothesis of equal a priori
probabilities (Tolman, in Principles of Statistical Mechanics, Dover
Press, New York (1971)). To provide our knowledge-base, multiple SA
runs are performed and for each step, the chosen dihedral, the value
of the chosen dihedral and the conformation energy at that step are
recorded (F. Guarnieri, Ph.D. Thesis, New York University (1992)).
This series of log files is converted into population distributions
by summing and/or averaging the number of hits in ten degree
intervals. The conformation space of the antiinflammatory agent LTB4,
which has fourteen rotatable dihedral angles, gives us the Flex-Map
(Wilson and Guarnieri, Tetrahedron Lett., 32:3601 (1991)) plots of
these fourteen bonds. One typical Flex-Map is shown in Figure 2.
These plots contain information on the overall population
distribution of each bond as a function of temperature. Since the
whole molecule is in flux with all energetic interactions taken into
account at every step, the Flex-Maps are mean field population
distributions with [...]. Hence, they are a true canonical ensemble
with respect to all flexible torsional states of the molecule. These
maps rapidly reveal occupied regions of dihedral space and "dead
zones" which are totally devoid of conformations at any temperature.
These "dead zones" are the key to why it is so difficult to search
the conformation space of flexible molecules with many rotatable
torsionals. Most methods sample from the whole space throughout a
conformation search. Clearly these dihedral distributions, which we
now call conformational memories, indicate that sampling from many
regions is a complete waste of time.
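
The conversion of logged dihedral values into a population
distribution over ten degree intervals, and the identification of
"dead zones", can be sketched as follows. This is a minimal
illustration rather than the utility program used in the study: the
function names, the [-180, 180) angle convention and the treatment of
a zero-population bin as a "dead zone" are assumptions made for this
example.

```python
def conformational_memory(dihedrals_deg, bin_width=10):
    """Normalized population of each bin for one bond at one temperature.

    Angles are wrapped onto [-180, 180); bin 0 covers -180 to -170 degrees.
    """
    n_bins = 360 // bin_width
    counts = [0] * n_bins
    for angle in dihedrals_deg:
        wrapped = (angle + 180.0) % 360.0              # map onto [0, 360)
        index = min(int(wrapped // bin_width), n_bins - 1)
        counts[index] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * n_bins


def dead_zones(populations, bin_width=10):
    """Intervals (low, high) in degrees that were never visited."""
    return [(-180 + i * bin_width, -180 + (i + 1) * bin_width)
            for i, p in enumerate(populations) if p == 0.0]


# Four logged values for a single hypothetical bond.
memory = conformational_memory([-175.0, -172.0, 65.0, 178.0])
print(len(dead_zones(memory)))  # 33 of the 36 bins are unpopulated
```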

It is self-evident that it will be vastly more efficient to sample
from the smaller space obtained from the elimination of "dead zones"
compared to sampling the original space. The key point is to make
sure that thermally accessible regions are not erroneously labeled as
"dead zones" because the second phase of the simulation would be
flawed. Thus, in the first part of the simulation, care must be taken
to insure a good sampling. This is why repeated simulated annealing
runs are performed in the initial phase. Multiple runs [...] and
large Monte Carlo steps have the best chance of sampling in every
region. We would like to point out that a comparable simulated
annealing strategy was shown to be capable of searching the entire
conformation space of cycloheptadecane (Wilson and Guarnieri,
Tetrahedron Lett., 32:3601 (1991); Guarnieri and Wilson, Tetrahedron,
48:4271 (1992); Guarnieri et al., J. Chem. Soc., Chem. Comm., 21:1542
(1991)) in less than 48 hrs on a microvax. In contrast, the
aforementioned effort (Saunders et al., J. Amer. Chem. Soc., 112:1419
(1990)) using most known search methods took about 2 CPU years on a
microvax. For more complicated systems such as LTB4, we are confident
that compilations of repeated runs started from different
configurations, using different random number seeds, and initialized
with a thermal energy of over 1000K, reveal the populated and
unpopulated regions of the 14 dimensional torsional space of LTB4. In
fact, the unpopulated regions are revealed particularly early on in
the simulation. In compiling 5, 10, 15 and 20 runs, the ratios of the
populated regions change (to a very small degree in going from 15 to
20 runs), but there is virtually no change in unpopulated regions
throughout this progression. For LTB4 the unpopulated regions make up
more than half of the total conformational space at 200K. A sampling
strategy which avoids these "dead zones" would reduce the volume of
conformation space that needs to be searched from 360^14 to less than
180^14, where 14 is the dimensionality of the space, and 360 is the
extent of one dimension.
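
The size of this reduction follows from simple arithmetic. The sketch
below assumes, as a simplification, that the same fraction of every
dihedral's range is populated; the function name is hypothetical.

```python
def volume_reduction_factor(populated_fraction, n_dimensions):
    """Factor by which the search volume shrinks when only a fraction
    of each dihedral's 360 degree range remains to be sampled."""
    return (1.0 / populated_fraction) ** n_dimensions


# Half of each of 14 dihedral ranges populated: 360^14 / 180^14 = 2^14.
print(volume_reduction_factor(0.5, 14))  # 16384.0
```

Because the populated fraction is stated to be less than one half,
16384 is a lower bound on the reduction under this assumption.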

Smart-Simulated Annealing: The Implementation Phase
The implementation phase involves utilizing the information contained
in the 14 conformational memories, an example of which is shown in
Figure 2. This is done by again running the SA Metropolis algorithm,
but instead of selecting new trial conformations at random over the
whole dihedral circle, we select new trial conformations by sampling
only from populated regions of the conformational memory for each
bond at a given temperature. To search for low energy conformations,
the populations at 200K were chosen. We call this technique of biased
sampling with simulated annealing Smart-SA.



A new procedure is needed to sample a dihedral space embedded with
"dead zones." In order to carry out this biased sampling, it is
necessary to map the uniformly distributed random numbers produced by
standard random number generators onto the conformational memories.
This process is illustrated in Figure 3. The conformational memory is
represented approximately by a classic three state model. Figure 3
shows the mapping of random numbers onto this distribution. To
perform this mapping, our algorithm requires information on the
number of states, the interval and population of each state. (In
[...] the number of states is actually four instead of three because
dihedral space goes from -180 to +180). The shaded regions are the
"dead zones" [...]. The first region has a 1/3 probability of being
surveyed: if the generated random number is between zero and 1/3, the
new dihedral will be selected from this first region. The exact value
that this bond will be set to is obtained by starting at the point on
the ordinate at the value of the random number, moving horizontally
until the dotted line is intersected, dropping a vertical from that
point, and selecting the dihedral value that lies at the intersection
of the vertical with the abscissa. For example, a probability of 0.6
maps to a dihedral value of 65°. This angle is passed to the dihedral
driver to construct the new conformation. Once this conformation is
created, the algorithm passes back to the SA Metropolis procedure
(Kirkpatrick et al., Science, 220:671 (1983); Simulated Annealing and
Optimization, M.W. Johnson, Ed., American Sciences Press, Syracuse,
N.Y. (1988); Wilson et al., Tetrahedron Letters, 4343 (1988); Wilson
et al., Proceedings of the Seventh Workshop on Vitamin D, Walter de
Gruyter Co., (1988); Wilson and Cui, Biopolymers, 29:225 (1990);
Wilson et al., J. Comp. Chem. 12, 3, 342 (1991)).
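
The mapping of a uniform random number onto the populated regions can
be sketched as a lookup in the cumulative population followed by
linear interpolation within the selected interval. This is an
illustrative reconstruction, not the authors' subroutine: the
function name, the ten degree bins starting at -180 degrees and the
two-peak example memory are assumptions made for this example.

```python
from bisect import bisect_right
from itertools import accumulate


def map_random_to_dihedral(u, populations, bin_width=10.0, start=-180.0):
    """Map a uniform random number u in [0, 1) onto a dihedral angle.

    `populations` holds one non-negative weight per bin, the first bin
    starting at `start` degrees. Empty bins ("dead zones") are skipped.
    """
    cumulative = list(accumulate(populations))
    total = cumulative[-1]
    if total <= 0:
        raise ValueError("at least one bin must be populated")
    target = u * total
    # First bin whose cumulative population exceeds the target; an empty
    # bin repeats the previous cumulative value and is never selected.
    i = bisect_right(cumulative, target)
    if i >= len(cumulative):
        # Only reachable through rounding when u is extremely close to 1:
        # fall back to the upper edge of the last populated bin.
        last = max(k for k, p in enumerate(populations) if p > 0)
        return start + (last + 1) * bin_width
    lower = cumulative[i - 1] if i > 0 else 0.0
    fraction = (target - lower) / (cumulative[i] - lower)  # interpolation
    return start + (i + fraction) * bin_width


# Hypothetical memory: half the population in -180..-170, half in 60..70.
memory = [0.5] + [0.0] * 23 + [0.5] + [0.0] * 11
print(map_random_to_dihedral(0.25, memory))  # -175.0
print(map_random_to_dihedral(0.75, memory))  # 65.0
```

Working from the cumulative distribution means that unpopulated
intervals occupy zero width on the random-number axis, so no trial
move is ever wasted on them.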

Results

The LTB4 problem above was run using the same SA control data as
previously reported (Kirkpatrick et al., Science, 220:671 (1983);
Simulated Annealing and Optimization, M.W. Johnson, Ed., American
Sciences Press, Syracuse, N.Y. (1988)). Ten runs of SA and Smart-SA
were carried out on LTB4 (500 steps at 25 temperatures). The
convergence results are shown in Figure 4. Much faster lowering of
the conformer energy occurs, producing rapid convergence.

On a system as complicated as LTB4 it is impossible to prove that a
conformation search is complete. One usual measure of
comprehensiveness is to perform repeated searches using different
initial conditions until the output of several searches produces the
same results. Ten ordinary simulated annealing runs on LTB4 produced
ten different low energy conformations. Ten runs of Smart-SA with the
conformational memories taken at 200K produced two conformations (6
of one and 4 of the other) which were both lower in energy than any
of the ten conformations found by ordinary simulated annealing.

Computational Details 

Since many of the computational details have been reported (Wilson et
al., Tetrahedron Letters, 4343 (1988); Wilson et al., Proceedings of
the Seventh Workshop on Vitamin D, Walter de Gruyter Co., (1988);
Wilson and Cui, Biopolymers, 29:225 (1990); Wilson et al., J. Comp.
Chem. 12, 3, 342 (1991); F. Guarnieri, Ph.D. Thesis, New York
University, 1992; Wilson and Guarnieri, Tetrahedron Lett., 32:3601
(1991)), only a brief summary will be outlined. Simulations are
started with beta=0.11. This corresponds to a temperature of 1093K,
where R=8.314e-3 kJ/mol K. After every block of steps beta is
multiplied by 1.1 to reduce the temperature. In theory, the cooling
schedule should be controlled and varied as a function of the heat
capacity. In practice, we have found that balancing the need for slow
cooling and obtaining acceptable CPU performance is a reasonable
compromise. At each step the dihedral angles that are rotated to
create the trial structure are noted. Whether the trial configuration
is accepted or rejected, the identity of the rotated dihedrals, the
extent of the rotation, the value of the energy of the trial
conformation and the new dihedral values if the trial conformation is
accepted or the old dihedral values if the trial conformation is
rejected are recorded to a log file in temperature blocks. One log
file is created for each run. A utility program inputs all of this
data and combines it according to temperature blocks. This data is
output in comma delimited format so that it can be imported into
Deltagraph (Deltagraph TM version 1.0, Copyright Deltapoint, Inc.,
200 Heritage Harbor, Suite G, Monterey, CA 93940) which is used to
plot the conformational memories. Using another utility program or
manually, a temperature slice from the conformational memory is
extracted. In the second phase of the simulation the sampling is done
from a subroutine which performs the calculation shown in Figure 3
instead of just using the standard random number generator.
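
The relation between beta and temperature, and the geometric cooling
in beta, can be checked with a short sketch. It assumes beta = 1/(RT)
with beta in mol/kJ and uses the 25 temperature blocks mentioned in
the Results; the function and constant names are illustrative.

```python
R_KJ_PER_MOL_K = 8.314e-3  # gas constant in kJ/(mol K)


def temperature_from_beta(beta, gas_constant=R_KJ_PER_MOL_K):
    """Temperature in kelvin for beta = 1/(R*T), with beta in mol/kJ."""
    return 1.0 / (gas_constant * beta)


def beta_schedule(beta_start=0.11, factor=1.1, n_blocks=25):
    """Beta for each temperature block; beta grows by `factor` per block,
    so the temperature falls by the same factor."""
    return [beta_start * factor ** k for k in range(n_blocks)]


temperatures = [temperature_from_beta(b) for b in beta_schedule()]
print(round(temperatures[0]))  # 1093
```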

At the outset of the study we were faced with the nearly intractable
14-dimensional problem. The learning phase of the simulation reveals
that more than half of the entire conformational space is unpopulated
"dead zones" at 200K. Going into the implementation phase of the
simulation, we were able to reduce the volume of space that needed to
be sampled by many orders of magnitude. Additionally, since several
dihedrals remained exclusively in a trans conformation at all
temperatures throughout the many learning phase simulations, we were
also able to reduce the dimensionality of the search in the
implementation phase by setting these dihedrals to a constant value
of 180 with no loss of generality. Therefore, Smart-SA (simulated
annealing with biased sampling) allows a considerable reduction in
the conformational space which needs to be sampled. Hence, much
larger systems, which have been generally considered computationally
intractable, may now be studied.

Having used this technique with much success in the area of
conformation searching, we have begun exploring Smart-SA applications
to quantitative problems such as free energy simulations. Preliminary
results with sampling proportional to the height of the populated
peaks have proven dependent upon the length of the learning phase.
Only after getting quantitative convergence in the ratio of the
populations are the final results invariant. This problem can be
traced to a violation of detailed balance. Preliminary results of
simulations sampling equally from all populated regions (with no
sampling from "dead zones"), which should not violate detailed
balance, have proven successful.

Example: Identification of Active GnRH Analogs

The key physiological role of the gonadotropin-releasing hormone
([pGlu1-His2-Trp3-Ser4-Tyr5-Gly6-Leu7-Arg8-Pro9-Gly10-NH2]; GnRH) as
a mediator of neuroendocrine regulation in the mammalian reproductive
system has made it the object of intense study for several decades.
The ability of GnRH and its analogs to modulate the pituitary-gonadal
axis has made them essential therapeutic agents in the treatment of a
variety of disorders ranging from [...] to prostatic carcinoma
(Casper, Can. Med. Assoc. J. [...] 160 (1991); Barbieri, Trends
Endocrinol. Metab. [...] 34). Conformational studies have played a
central role in the quest to understand the structural basis for the
activities of these peptides, as well as in attempts to design new
analogs with improved pharmacological properties. Investigations of
the biological mechanisms underlying the actions of the GnRH are
quite difficult because small peptides are extremely flexible.
Spectroscopic techniques, for example, indicate that a multitude of
interconverting conformers exist simultaneously. In an attempt to
pare away some of the many irrelevant conformations, several
investigators have synthesized restricted GnRH analogs (Rizo et al.,
J. Amer. Chem. Soc., 114:2852 (1992); Bienstock et al., J. Med.
Chem., 36:3265 (1993)). This approach has proven useful for defining
some structural motifs of antagonists of the hormone. Obtaining
comprehensive detailed molecular conformational properties, however,
such as the specific dihedral values of the bioactive forms, is a
formidable task given that the GnRH has 35 rotatable bonds as shown
in Figure 5.

The inherent complexities of small flexible peptides have motivated
[...] computational and experimental techniques (Young and Hicks,
Biopoly., 34:611 (1994)). The computational method of choice is
molecular dynamics. While dynamical techniques are capable of
revealing [...] motions, these methods are generally incapable of
exploring the ensemble of conformational states that exist in
flexible molecules (Guarnieri and Still, J. Comp. Chem., 15:1302
(1994)). To explore the whole ensemble of conformational states that
exist in the GnRH, we have used the recently developed technique of
conformational memories (Wilson and Guarnieri, Tetrahedron Lett.,
32:3601 (1991)). Here we show that application of this technique can
yield converged population distributions of all 35 rotatable bonds of
the peptide GnRH with no approximations. Samples from the
conformational memories using the conformational memory biased
sampling technique were used to characterize the conformational
families of GnRH and several of its analogs, in an aqueous
environment modeled with the generalized Born/surface area (GB/SA)
method (Still et al., J. Amer. Chem. Soc., 112:6127 (1990)). This
analysis reveals the conformational preferences of the GnRH and its
analogs, and suggests some of the structural determinants for their
biological function.

Method of Conformational Memories

The simulation technique of conformational memories is a two stage
process consisting of an exploratory phase and a biased sampling
phase. In the exploratory phase repeated runs of Monte Carlo
simulated annealing (MC/SA) (Kirkpatrick et al., Science, 220:671
(1983)) are carried out in order to map out the conformational space
of the flexible molecule. The construction of Conformational Memories
described below has been interfaced with the Macromodel (Mohamadi et
al., J. Comput. Chem. 11:440 (1990)) molecular modeling package
version 5.0 so that the continuous GB/SA solvent model (Still et al.,
J. Amer. Chem. Soc. 112:6127-6129 (1990)), and the recently developed
amino acid backbone torsional potentials (McDonald et al., Tetra.
Lett. 33:7743-7746 (1992)) from the Macromodel package could be used
in the present conformational study. As applied here, the MC/SA
protocol for the exploratory phase was designed with a starting
temperature of 2070K and a cooling schedule of T(n+1) = 0.9*T(n) for
nineteen discrete temperature points. At each temperature 10,000
steps were applied to the 35 rotatable bonds (Fig. 5), cooling the
system to a final temperature of 310K. Trial conformations in the
MC/SA routine were generated by randomly picking 2 rotatable bonds
from among the 35, rotating each bond by a random value between ±180
degrees, and accepting or rejecting the trial conformation according
to the standard Metropolis (Metropolis et al., J. Chem. Phys.,
21:1087 (1953)) criteria with a Boltzmann probability function
defined at the given temperature. After each step, whether the
conformation was accepted or rejected, the data describing the
rotated bonds, the extent of rotation, the energy, and the value of
the dihedral angles are recorded to a "log file". An example of the
output to the log file is shown in Table 1. In this example, the
first group of entries, corresponding to the first two lines, is the
result of a rejected step as indicated by the zeros in the first
column. The second and third columns identify the atom numbers of the
bonds that were rotated to create the trial move (in this example
atom40-atom41 and atom47-atom48). The fourth column lists the extent
of rotation of the torsion angle in degrees. The fifth column lists
the total energy of the structure. The sixth column holds the current
dihedral value of the bond.

The second group of entries in Table 1, corresponding to the next two
lines, lists the results of a trial move that was accepted as a new
conformation (as indicated by the digit one in the first column). The
current dihedral values given in the last column are the new values
of the newly accepted conformation.

Each run of MC/SA consists of a random walk of
190,000 steps (19 temperatures, 10,000 steps per
temperature). Because two lines of data are added to
the log file for each Monte Carlo step, a single run
creates a file of 380,000 lines. To explore the
conformations of GnRH peptide in GB/SA water, we
performed 157 of these simulations, creating log files
of the different random walks. A 157 run MC/SA
simulation requires about 12 days of computation on an
SGI Challenge 200 MHz workstation.

To obtain structural information from this large
amount of data, the log files are used as input to a
program (called Flex) that sorts, merges, and compacts
the data in several ways. Since the simulations were
done at 19 temperatures for each peptide, application
of Flex first sorts and merges the data from all log
files into 19 temperature blocks. Subsequently, within
each temperature block, the data are partitioned into
35 bond blocks, one for each rotatable bond. For each
rotatable bond, the dihedral angle space is partitioned
into 36 ten degree intervals. From each line of data
for a given bond at a given temperature, the program
records the number of times that the bond dihedral
angle value belongs to one of the ten degree buckets,
i.e. a "Conformational Memory". Finally, the Flex
program produces a 19x36 spreadsheet (recording 19
temperatures by 36 10-degree dihedral intervals with
normalized populations) for each of the 35 rotatable
bonds of the GnRH peptide. An excerpt of one of these
spreadsheets is given in Table 2. The spreadsheets are
imported into Deltagraph (TM Version 1.0, Copyright
Deltapoint, Inc. 200 Heritage Harbor, Suite G,
Monterey, CA 93940, (1987)) for plotting, and graphical
representations of the data in the spreadsheets are
given in Figures 6 A-D. Across the top of the
spreadsheet are the dihedral angle values from -170° to
180° which label the y-axes of Figures 6 A-D (note that
the spreadsheet fragment is cut off at -100). In the
first column are the 19 temperatures, which range from
2070K to 310K and which label the x-axes in Figures 6
A-D. The value in a spreadsheet position corresponding
to a given temperature and a given ten degree dihedral
bucket is the population percentage, which is plotted
on the z-axes.
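
The sorting and bucketing performed by Flex can be illustrated with a minimal sketch. The Flex source is not described in the text, so the record layout (temperature, bond, dihedral), the function names, and the choice of half-open buckets [-180, -170), [-170, -160), ... are assumptions made only for this example.

```python
from collections import defaultdict

N_BINS = 36  # 36 ten-degree buckets over the 360 degree dihedral circle

def bin_index(angle_deg):
    """Map a dihedral in degrees to a bucket index 0..35 (bucket 0 starts at -180)."""
    wrapped = (angle_deg + 180.0) % 360.0
    return min(int(wrapped // 10.0), N_BINS - 1)

def conformational_memory(records):
    """Build normalized populations from (temperature, bond, dihedral_deg) records.

    Returns {bond: {temperature: [36 population percentages]}}.
    """
    counts = defaultdict(lambda: defaultdict(lambda: [0] * N_BINS))
    for temperature, bond, angle in records:
        counts[bond][temperature][bin_index(angle)] += 1
    memory = {}
    for bond, by_temperature in counts.items():
        memory[bond] = {}
        for temperature, histogram in by_temperature.items():
            total = sum(histogram)
            memory[bond][temperature] = [100.0 * c / total for c in histogram]
    return memory

# Tiny illustrative input: four log entries for bond 4 at 310K.
records = [(310, 4, 175.0), (310, 4, -65.0), (310, 4, -61.0), (310, 4, 60.0)]
memory = conformational_memory(records)
```

For one bond, the rows of memory[bond] taken over the 19 temperatures correspond to the 19x36 spreadsheet described above.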

The procedure for creating conformational memories
for the dihedral angles results in an enormous
compression of the large volume of data needed to
describe a 35 dimensional hypertorsional space. The
condensation of the information in Deltagraph plots
yields identifiable structural motifs. For example,
bond 4, shown in Figure 6a, has a classic three state
distribution: trans, gauche+ and gauche-; bond 6, the
phi angle of residue 3 (Fig. 6b), has a continuous
population distribution over a very large range from
about -60 to -180 degrees and no population in the other
regions at any temperature. In contrast, bond 7, the psi
angle of residue 3 (Fig. 6c), has a narrow all trans
distribution. The distribution of bond 35 (Fig. 6d)
favors a trans conformation, but maintains significant
population over the entire dihedral circle at all
temperatures. The construction of conformational
memories has been interfaced with the Macromodel
molecular modeling package (Mohamadi et al., J. Comp.
Chem., 11:440 (1990)) version 5.0 so that the continuum
GB/SA solvent models and the recently developed amino
acid backbone torsional potentials (McDonald and Still,
Tetra. Lett., 31:7743 (1992)) from the Macromodel
package could be used in the present conformational
study.

Convergence of Conformational Memories

By including the results from multiple
explorations of all possible combinations of dihedral
angle values for all rotatable bonds of the molecule,
the thirty-five conformational memories provide a
complete mapping of the conformational space of GnRH
with no approximations, as long as the calculated
populations are converged. In the original formulation
of the method, population convergence was identified as
the difficult and crucial aspect of forming
conformational memories (Wilson and Guarnieri,
Tetrahedron Lett., 22:3601 (1991)). Because the second
phase of the simulation, the biased sampling, explores
only the parts of the conformational space identified as
populated regions, population convergence ensures that
regions that could be thermally accessible are not
erroneously labeled as being unpopulated. The correct
identification of the populated regions is essential
for the second phase of the simulation, because the
biased sampling only explores populated regions of the
conformational space.
Population convergence for the GnRH was confirmed
in three different ways: by creating conformational
memory difference maps for simulations of different
length, by analyzing intrinsic symmetry, and by showing
that there is no significant difference in the
populations of actual structures of GnRH created from
Conformational Memories obtained from 25, 50, 75, 100
and 157 independent MC/SA runs. Figure 3 shows the
Conformational Memory difference maps for dihedral
angle I,! comparing simulation lengths of 10, 25, 50, 75 
and 100 runs. The difference map in Figure 3a is
created by subtracting the Conformational Memory
obtained from a 25 run MC/SA simulation from that of a
10 run MC/SA simulation; in Figure 3b the difference is
between 50 and 25 runs; Figure 3c shows the difference
between 75 and 50 runs; and Figure 3d is the difference
map between 100 and 75 runs. The progression clearly
shows the convergence. The other dihedral angles have
very similar difference maps for this sequence of
comparisons.
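
A difference map of this kind is simply an element-by-element subtraction of two temperature-by-bucket population tables. The sketch below assumes each Conformational Memory is held as a list of rows (one per temperature) of population percentages; the function names and the single-number summary max_abs_difference are illustrative additions, not part of the described method.

```python
def difference_map(memory_a, memory_b):
    """Element-wise difference of two temperature-by-bucket population tables."""
    return [
        [a - b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(memory_a, memory_b)
    ]

def max_abs_difference(memory_a, memory_b):
    """Largest absolute entry of the difference map (a crude convergence measure)."""
    return max(abs(d) for row in difference_map(memory_a, memory_b) for d in row)

# Toy 2x2 tables standing in for 19x36 memories from two simulation lengths.
shorter = [[10.0, 90.0], [50.0, 50.0]]
longer = [[12.0, 88.0], [50.0, 50.0]]
diff = difference_map(shorter, longer)
```

As more runs are accumulated, the entries of successive difference maps would be expected to shrink toward zero.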

A second measure of convergence is symmetry.
Because dihedral angle 17 has a 2-fold axis of
symmetry, it is expected that the dihedral space of this
bond will have symmetric population distributions
centered at -90 and 90 degrees. The population
distribution at 310K of this dihedral angle from the
MC/SA simulations is shown in Figure 8. The population
distributions conform to these expectations.
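
One way to quantify the symmetry argument is to compare each bucket with the bucket 180 degrees away, since a dihedral with a 2-fold axis should show the same population at both. This is a hedged sketch: the function name and the use of the largest bucket-to-bucket discrepancy as the measure are assumptions, not something the text specifies.

```python
def twofold_asymmetry(populations):
    """Largest |P(bucket) - P(bucket shifted by 180 degrees)| over all buckets."""
    n = len(populations)
    half = n // 2  # 18 buckets of 10 degrees = 180 degrees
    return max(abs(populations[i] - populations[(i + half) % n]) for i in range(n))

# A perfectly symmetric toy distribution with peaks near -90 and +90 degrees.
symmetric = [0.0] * 36
symmetric[9] = 50.0    # bucket starting at -90 degrees
symmetric[27] = 50.0   # bucket starting at +90 degrees
```

A converged simulation would be expected to give a value close to zero for such a bond, while an unconverged one could leave one of the two wells over-represented.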

The third indication of convergence is the finding
(see below) that biased sampling from Conformational
Memories created from 25, 50, 75, 100 and 157 MC/SA
runs yields very similar profiles of GnRH.

Biased Sampling From Conformational Memories:
Elimination of Barriers

Once the conformational memories are established,
a new Monte Carlo search is performed at 310K, sampling
only from the populated regions. Because only about 50%
of the torsional space of the 35 bonds is populated at
310K, the conformation space that needs to be explored
in the biased sampling phase of the simulation has been
reduced by orders of magnitude. Table 3 is an excerpt of
the probability matrix for GnRH at 310K. The dimension
of this probability matrix is 35x36, for the 35
rotatable bonds partitioned into 36 buckets over the 360
degree dihedral space (note that only 11 of the 36
dihedral buckets and only 14 of the 35 dihedral angles
are shown in Table 3). The first line indicates that at
310K bond 1 is found in the first dihedral interval
10.1% of the time. In contrast, the seventh column of
the top row indicates that bond 1 is never found in the
dihedral interval -120 to -110 at 310K.
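
The size of this reduction can be estimated with simple arithmetic. The sketch below assumes, purely for illustration, that exactly half of the dihedral range of every one of the 35 bonds is populated and that the bonds are treated independently; the function name is hypothetical.

```python
import math

def populated_volume_fraction(fraction_per_bond, n_bonds):
    """Fraction of the full torsional space left when each bond keeps fraction_per_bond."""
    return fraction_per_bond ** n_bonds

# About 50% populated per bond, 35 rotatable bonds.
fraction = populated_volume_fraction(0.5, 35)
orders_of_magnitude = -math.log10(fraction)
```

Under these assumptions the remaining volume is 0.5^35, about 3 x 10^-11 of the full space, i.e. a reduction of between 10 and 11 orders of magnitude.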

The two stage process of developing Conformational
Memories and then performing the biased sampling from
these distributions is necessary in order to sample the
entire conformation space of the molecule. An
obviously simpler alternative would be to limit the
conformational exploration to standard Metropolis Monte
Carlo at 310K and monitor the development of the random
walk over torsional space. However, this simulation
constitutes the last step in the development of the
Conformational Memories for the temperature of 310K; it
is clearly inadequate, as indicated by the acceptance
rate. The acceptance rate is about 28% at 2070K, with
a step size chosen randomly within the interval of +/-
180 degrees and rotating two dihedrals selected
randomly at each step. At 310K, using the same
parameters, the acceptance rate falls below 2%.
Therefore, the sampling of the 35 dimensional dihedral
space would be incomplete if these parameters were used
for the Monte Carlo random walk procedure at 310K. Even
if the random interval from which trial configurations
are sampled were reduced to +/-30 degrees (to increase
the acceptance rate), sampling would still be
insufficient because the majority of new conformations
would be in the local area of the previous conformation.
The +/-180 degree step size was deliberately chosen so
that new conformations can be created by jumping between
wells without having to climb over barriers. A single
simulated annealing run cannot be expected to cover
such a vast space, but the accumulation of multiple
runs, each performing a different random walk,
can be shown to converge, as illustrated in Figures 7 



3d and Figure 8. 

The restriction of the sampling to the populated
regions identified in the previous step (i.e., the
Conformational Memories) is achieved by partitioning
the 0-1 interval of the random number generator into
the 36 parts which correspond to the 36 separate 10-
degree intervals for each rotatable dihedral angle.
The size of each part of the 0-1 interval is
proportional to the population of the 10-degree bucket.
New biased trial conformations are generated by
randomly choosing two bonds, generating a new
random number for each bond, determining to which of
the 36 intervals each new random number for each bond
belongs, and driving the dihedrals to the appropriate
intervals. The exact value of the new dihedral is
determined by a linear interpolation. This procedure
is illustrated in Figure 9.
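
The partitioning of the 0-1 interval can be sketched as an inverse lookup on the cumulative populations. This is a minimal illustration under assumptions: the function names are hypothetical, buckets are taken to start at -180 degrees, and "linear interpolation" is read here as placing the angle within the chosen bucket in proportion to where the random number falls inside that bucket's sub-interval.

```python
import bisect

def cumulative_edges(populations):
    """Running sums of the bucket populations (upper edge of each sub-interval)."""
    edges, running = [], 0.0
    for p in populations:
        running += p
        edges.append(running)
    return edges

def sample_dihedral(populations, u, bin_width=10.0, start=-180.0):
    """Map a uniform random number u in [0, 1) to a dihedral angle in degrees."""
    edges = cumulative_edges(populations)
    target = u * edges[-1]
    # bisect_right returns the first edge strictly above target, so
    # zero-width (unpopulated) sub-intervals are never selected.
    k = bisect.bisect_right(edges, target)
    if k == len(edges):  # guard against floating-point round-off for u close to 1
        k = max(i for i, p in enumerate(populations) if p > 0)
    lower = edges[k - 1] if k > 0 else 0.0
    fraction = (target - lower) / (edges[k] - lower)
    return start + (k + fraction) * bin_width

# Toy memory with two equally populated buckets: [-180, -170) and [0, 10).
toy = [0.0] * 36
toy[0] = 50.0
toy[18] = 50.0
```

In use, u would come from a random number generator (for example random.random()), and one such draw would be made for each of the two bonds chosen for the trial move.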

A major advantage of the Conformational Memory
biased sampling method is that partitioning the random
number generator among the populated intervals results
in a sampling technique that eliminates the barrier-
crossing problem. During the biased sampling random
walk, a new trial configuration is sampled from the
Conformational Memories of the populated dihedral
space, and then the trial conformation is created by
driving the current structure to the appropriate
configuration. Hence, the notion of a barrier
restricting access to any part of the conformational
space is eliminated in this procedure. Because
Conformational Memories are mean field population
distributions, the correlations among the different
flexible torsional angles are lost in the averaging
process. Nevertheless, the Conformational Memory biased
sampling technique tends to bring together the higher
probability regions of the different dihedrals. Thus,
the method introduces average correlations among the
different dihedral angles during the selection process,
while accessing all populated regions. It is important
to note that the original formulation of the
Conformational Memories biased sampling technique
(Guarnieri and Wilson, J. Comput. Chem. 16:648-653
(1995)) violates detailed balance. Here, we have
corrected the biased sampling so that it obeys detailed
balance by multiplying the Boltzmann function used in
the Metropolis test with the factor
P1old*P2old/(P1new*P2new), where P1old and P2old are
the population percentages of the ten degree intervals
of the Conformational Memories of the two dihedral
angles in the current conformation of the random walk
(because in this example two dihedrals per step are
changed). P1new and P2new are the corresponding
population percentages of the new dihedral values for
these angles in the new trial conformation.
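
The corrected acceptance test can be written as the usual Metropolis ratio multiplied by the ratio of bucket populations, old over new. The following is a sketch rather than the authors' code: the function name, the kJ/mol unit behind K_B, and the evaluation in logarithms (to avoid overflow for strongly downhill moves) are choices made for this example, and every population passed in is assumed to be greater than zero.

```python
import math

K_B = 0.0083145  # kJ/(mol K); the energy unit is an assumption of this sketch

def biased_acceptance(delta_e, temperature, p_old, p_new):
    """Acceptance probability min(1, exp(-dE/kT) * prod(p_old) / prod(p_new)).

    p_old and p_new hold the bucket populations of the changed dihedrals in the
    current and trial conformations; all entries must be positive.
    """
    log_ratio = sum(math.log(p) for p in p_old) - sum(math.log(p) for p in p_new)
    exponent = -delta_e / (K_B * temperature) + log_ratio
    return 1.0 if exponent >= 0.0 else math.exp(exponent)

# Moving two dihedrals from 5% buckets into 10% buckets at no energy cost.
probability = biased_acceptance(0.0, 310.0, (5.0, 5.0), (10.0, 10.0))
```

The example shows the effect of the correction: a move into more heavily populated buckets is proposed more often, so at equal energy it is accepted with probability 0.25 instead of 1.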

Development of Conformational Families

We performed several sequences of biased sampling
runs at 310K to determine the best and simplest way to
create representative conformational families for the
GnRH peptide. The first run was a 10,000 step MC
random walk using the Conformational Memory biased
sampling technique with uniform sampling of 100
structures (1 sample every 100 steps). The second run
was a 50,000 step MC random walk using the
Conformational Memory biased sampling technique with
uniform sampling of 100 structures (1 sample every 500
steps). The third and fourth runs were 100,000 and
500,000 step biased sampling runs also sampling 100
structures in the same manner. Each batch of 100
structures was analyzed with the program XCluster
(Shenkin and McDonald, J. Comput. Chem.). XCluster
inputs the series of 100 conformations and computes the
RMS deviations between the different conformations.
Structures 2-100 of the input sequence are then
reordered based on increasing RMS deviation. In the
new ordering, considering all 100 conformations,
conformer 2 has the smallest RMS deviation from
conformer 1, and conformer 3 has the smallest RMS
deviation from conformer 2, etc. XCluster then produces
a graphical representation of the RMS deviations between
every pair of conformers. Since the conformations have
been rearranged so that the RMS deviation between
nearest neighbors is minimal, any large jump in RMS
deviation between nearest neighbors is indicative of a
large structural change and hence identifies a new
conformational family. As described below, we settled on
500,000 steps for the subsequent biased sampling runs.
We then performed these biased sampling runs using
Conformational Memories created from 25, 50, 75, 100
and 157 run MC/SA simulations.
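
The reordering and jump detection described above can be illustrated with a small greedy sketch. It is not a reimplementation of XCluster: it assumes a precomputed symmetric matrix of pairwise RMS deviations, and the function names and the jump threshold are hypothetical.

```python
def reorder_by_nearest_neighbor(rms):
    """Greedy ordering: each conformer is followed by its closest unused neighbor."""
    order = [0]
    remaining = set(range(1, len(rms)))
    while remaining:
        last = order[-1]
        # Ties are broken by the lower index so the result is deterministic.
        nearest = min(remaining, key=lambda j: (rms[last][j], j))
        order.append(nearest)
        remaining.remove(nearest)
    return order

def split_families(rms, order, jump):
    """Start a new family whenever consecutive conformers differ by more than jump."""
    families = [[order[0]]]
    for previous, current in zip(order, order[1:]):
        if rms[previous][current] > jump:
            families.append([])
        families[-1].append(current)
    return families

# Toy data: four "conformers" on a line; distances stand in for RMS deviations.
points = [0.0, 5.0, 0.1, 5.2]
rms = [[abs(a - b) for b in points] for a in points]
order = reorder_by_nearest_neighbor(rms)
families = split_families(rms, order, jump=1.0)
```

In the toy data the ordering visits the two nearby conformers first and then jumps by 4.9 units, which is where the second family begins.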

III. Results And Discussion
Conformational Families of GnRH

The 500,000 step biased sampling runs for GnRH
with a sampling rate of 1 every 5,000 structures
require 4.3 hours per run on a 200 MHz SGI Challenge
workstation. The structures from each biased
sampling run were clustered in conformational families
as described above. A backbone trace of representatives
from the 5 families with very distinct backbone
conformations that emerged from this procedure is shown
in Figure 10. The same families were obtained
regardless of the origin of the Conformational Memories
from MC/SA simulations of 25, 50, 75, 100 or 157 runs.
Families of conformations having a beta-turn between
residues 5-8 occur with a frequency of approximately
70%. A distribution showing a superimposition of 70 of
these structures is illustrated in Figure 11 (GnRH is
colored in red, with Arg8 colored in green). The beta-
type turn common to all the structures in this family
is clearly evident (Fig. 11). In contrast, families
which have an extended backbone occur with a frequency
of about 5%. The distribution of side chain
orientations of Arg8 is wider than that of any other
residues in GnRH. The results of Struthers et al.
(Proteins: Structure, Function and Genetics 8:295-304
(1990)) from the examination of different GnRH analogs
seem to indicate that the arginine is required as part
of the pharmacophore. The present results, on the other
hand, may indicate that the role of Arg8 in the receptor
interaction of GnRH could relate to the backbone
conformation, rather than to its participation in a
recognition pharmacophore.

It is noteworthy that biased sampling runs of
10,000-25,000 steps resulted in large (unconverged)
fluctuations in the ratio of beta-turn to extended
backbone conformations. However, more extended biased
sampling runs of 100,000-500,000 steps resulted in
negligible fluctuations in the ratio of beta-turn to
extended backbone conformations. Although it appeared
from our calibration studies that 100,000 step biased
sampling runs are sufficient, we chose to carry out the
more extensive 500,000 step biased sampling runs for all
the calculations presented here.

Conformational Families of Lys8-GnRH

The Lys8 analog of GnRH had been constructed to
explore the role of Arg8 in the recognition of
GnRH by its receptor (Karten and Rivier, Endo. Rev.
2:44-66 (1986); Millar et al., J. Biolog. Chem.
264:21007-21013 (1989)). Mutation studies of GnRH
receptors from various species have implicated Arg8 as
being important for mammalian hormone-receptor
recognition (Flanagan et al., J. Biolog. Chem.
269:22636-22641 (1994)). To analyze the structural
implications of Arg8 for the activity of GnRH, we
compared the conformational profile of the peptide
hormone with that of the mutant Lys8-GnRH, which is
known to be a low affinity GnRH agonist. In contrast to
the wild type hormone, the major conformational family
of the Lys8-GnRH congener was found to have an extended
backbone, while the beta-turn conformation exists as a
very minor family. A backbone trace of a representative
of each family is shown in Figure 12. The family of
conformations represented in Figure 12a has an extended
backbone and occurs with a frequency of greater than
70%. The Lys8-GnRH family that has a beta-type turn
conformation of the backbone (Figure 12b), which is
virtually identical to the major conformational family
of the GnRH (Figure 10a), has a probability of only
about 3%. A distribution of the members of the
predominant Lys8-GnRH family superimposed upon each
other is shown in Figure 13, with the entire molecule
shown in red, except for Lys8 which is colored green.
Because the Lys8-GnRH has a low affinity for the GnRH
receptor, but elicits the same response once it
interacts with the receptor, it is tempting to suggest
that adoption of a large population of beta-type turn
conformation is a key requirement for hormone-receptor
recognition. This inference agrees with earlier
proposals in the literature, and is supported by results
from additional Conformational Memories simulations on
the structural characterization of eight other GnRH 
analogs that exhibit dW 

the beta-turn like structures and the fully extended
conformations of the backbone (Guarnieri et al.,
unpublished results). It is particularly noteworthy
that our simulations lead to the same conclusions 
regarding the importance of that 
were drawn from their confined ^ and molecular 
dynamics studies of conformationally constrained GnRH
analogs (Struthers et al., Proteins: Structure,
Function and Genetics 8:295-304 (1990)).

Structural Comparison To a Constrained GnRH Analog
To test the key inference from the present
simulations of GnRH analogs, regarding the correlation
between the population of beta-type turn structure and
affinity for receptor, we compared several samples from
the most populated conformational family of GnRH
obtained from Conformational Memories to a structurally
constrained cyclic decapeptide GnRH analog (Baniak et
al., Biochem. 26:2642-2656 (1987)). The conformation of
this cyclic decapeptide was determined from NOE data
using 2D NMR techniques (Baniak et al., Biochem.
26:2642-2656 (1987)). These experimental studies
concluded that residues 6 and 7 formed a type II
beta-turn and residues 1 and 2 formed a type II
beta-turn. Additionally, it was concluded that a weak
hydrogen bond existed between the Arg8-NH and the
Tyr5-CO, and a stronger hydrogen bond between the
D-Trp3-NH and the beta-Ala10-CO. To allow for the
comparison, a structure of this GnRH analog was built in
Macromodel 4.5 according to the specifications (Baniak
et al., Biochem. 26:2642-2656 (1987)), and using the
beta-turn definitions of Hutchinson and Thornton
(Hutchinson and Thornton, Protein Science 2:2207-2216
(1994)). This reconstructed GnRH analog was compared to
the GnRH structures obtained from the Conformational
Memories described above.

Several of the members of the major conformational
family of the GnRH obtained from the Conformational
Memories were selected at random and superimposed on
the reconstructed geometry of the analog, using the 11
backbone atoms from the Tyr5-CO to the -N of Pro9.
All computationally derived structures superimposed on
the reconstructed structure with RMS deviation in a
range of 0.6-0.8 Å. An illustration of the
superimposition is shown in Figure 14. Clearly, the
computationally derived structure is closely related to
the reconstructed backbone of residues 5-8 of the
experimentally derived peptide structure. The
structures diverge between the N-terminus and residue 4,
and a superimposition of all backbone atoms results in
a 5 Å RMS deviation.

Recently, Nikiforovich and Marshall (Int. J.
Peptide Protein Res. 42:171-180 (1993)) constructed low
energy conformations of GnRH using the ECEPP program
(Dunfield et al., J. Phys. Chem. 82:2609-2616 (1978)).
We have reconstructed eight conformations from the
published list of backbone dihedral angles and a list of
side chain dihedrals graciously provided by the authors
(Nikiforovich and Marshall, Int. J. Peptide Protein
Res. 42:171-180 (1993)). The energies of these
reconstructed peptide structures were compared with
representatives of the three major families of GnRH
found using Conformational Memories. The optimal
geometries from the two computational
methods were quite different, and the energies of the
eight conformations calculated from ECEPP were 300-400
kJ/mol (20-25%) higher than those calculated from the
conformations generated using Conformational Memories.
It is unlikely that this large difference can be
attributed merely to the use of different force fields
in the definition of optimal conformations, since a
recent comparative study resulted in very similar low
energy Met-Enkephalin structures (Montcalm et al., J. of
Mol. Str. (Theochem) 308:37-51 (1994)). However, a major
source of difference may be the use of a GB/SA water
model in the Conformational Memories approach, and
perhaps a more complete exploration of the
conformational space.

Exploration of the Unpopulated Regions 

As a stringent test of the completeness of the
conformational exploration, we performed extensive
sampling from the unpopulated regions of the
Conformational Memories for several key dihedral angles
involved in the formation of the beta-turn of the GnRH.
With one exception, this sampling produced high energy
structures in all cases, as expected. The one
interesting exception occurred during the sampling of
the unpopulated regions of the phi angle of Gly6. This
sampling produced a structure only 20 kJ/mol higher in
energy than the best GnRH structure. The dihedral
value came from a bin that had a 0.6% population at
345K, but had a 0% population at 310K and therefore was
not included in the populated portion from which the MC
biased sampling was done. A simple way to avoid
missing a very low probability low energy structure
when performing the biased sampling at 310K is to use
the probability weights from a higher temperature. Our
exploration of the unpopulated regions of the phi angle
of Gly6 at temperatures 100-200K above 310K eliminated
this problem. The small drawback, however, is that 44%
of the dihedral space of Gly6 is unpopulated at 310K
and only 33% is unpopulated at 473K. Thus, a safety
factor during the biased sampling run involves
exploring about 10% more dihedral space per rotatable
torsion, but ensures the inclusion of all populated
areas. Conformational regions that exhibit 0%
population in the calculation of the isolated peptide
in water at 310K may still be of biological importance,
if some of these conformations can be induced by the
interaction energies of the peptide with the receptor.
The finding that regions unpopulated at 310K are in
fact populated at temperatures higher by only 100K
(corresponding to an energy difference of only a
fraction of a kcal/mol) indicates the feasibility of
such "receptor-induced" conformations.
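
Two of the numbers in this paragraph can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, the thermal energy is estimated as R times the temperature difference, and reading the quoted 44% and 33% as 16 and 12 empty buckets out of 36 is an inference from the percentages, not something stated in the text.

```python
R_KCAL = 1.987e-3  # gas constant in kcal/(mol K)

def thermal_energy(delta_t):
    """R * delta_t in kcal/mol for a temperature difference delta_t in K."""
    return R_KCAL * delta_t

def unpopulated_fraction(populations):
    """Fraction of ten-degree buckets that have zero population."""
    return sum(1 for p in populations if p == 0) / len(populations)

# Hypothetical 36-bucket rows with 16 and 12 empty buckets respectively.
row_310 = [0.0] * 16 + [5.0] * 20
row_473 = [0.0] * 12 + [100.0 / 24] * 24

energy_100k = thermal_energy(100.0)
extra_space = unpopulated_fraction(row_310) - unpopulated_fraction(row_473)
```

Under these assumptions a 100K difference corresponds to about 0.2 kcal/mol, and using the higher-temperature weights opens up roughly 11% more of the dihedral circle, consistent with the "about 10%" quoted above.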
Applied to the decapeptide GnRH, the
method of Conformational Memories provides
a powerful practical solution to the complex problem
presented by the flexibility of polypeptides with a
large number of conformational degrees of freedom.
With the study of the flexible decapeptide, the method
was shown to be capable of achieving complete sampling
of the conformational space, to converge in a very
practical number of steps, and to overcome energy
barriers efficiently. The results of the computations
point to a relation between the beta-turn structure
identified as the major conformational family of GnRH,
and high affinity for the GnRH receptor. While these
inferences were inherent in the results for earlier
investigations of conformationally restricted GnRH
analogs, the present study provides unbiased support for
this mechanistic hypothesis based on a complete
exploration of the conformational space of the peptide
hormone itself and its unconstrained congeners. Because
the method seems to have produced the lowest energy
conformers reported for GnRH from a full exploration
that is economical and practical, its general
application to the study of peptide structure-function
relations should continue to produce important
mechanistic insights and powerful guides for ligand
design.

Table 1

A sample of the output collected in the history
files of the simulated annealing random walks. Column
1 indicates if the data is produced from an accepted or
rejected step, with 0=rejected and 1=accepted. The
second column lists the pair of atom numbers identifying
the dihedral angles that were rotated to produce the
trial structure. The third column lists the extent to
which the dihedral was rotated in order to create the
trial structure. The fourth column lists the energy of
the current conformation (the energy of the original
structure if rejected, or the new structure if
accepted). The fifth column lists the current dihedral
values of the conformation (the dihedral angle of the
original structure if rejected, or the new structure if
accepted).
Table 2

A sample of a conformational memory spreadsheet.
The first row labels the dihedral circle across the y-
axis. The first column labels the temperatures across
the x-axis. Each cell contains the population
corresponding to a given temperature and a given 10
degree dihedral bucket, which is plotted on the z-axis.
Note that the columns of the spreadsheet are cut off
after -40 degrees.
Table 3

Excerpt from the population probability matrix for
GnRH at 310K. The dimensions of this matrix are 35x36
(35 rotatable dihedral angles, with the population
distribution of each angle broken into 36 intervals of
10 degrees). Note that only 14 rows and 11 columns of
this matrix are shown in the Table.

Various publications are cited herein, the
contents of which are hereby incorporated by reference
in their entireties.
CLAIMS

1. A method for identifying a structurally active
molecule, comprising:

(a) performing multiple simulated annealing runs
in order to reveal populated and unpopulated regions of
multidimensional conformation space; and

(b) performing a simulation at a fixed
temperature, with sampling only from populated regions
found in the first step.